Hardware assisted performance state management based on processor state changes

ABSTRACT

A processor is configured to support a plurality of performance states and idle states. The processor includes a first programmable location associated with a first idle state and configured to store first entry performance state (P-State) information. The first entry P-State information identifies a first entry P-State. The processor is configured to receive a request to enter the first idle state, retrieve the first entry P-State information and enter the first entry P-State. The processor may include a second programmable location associated with the first idle state and configured to store first exit P-State information. The first exit P-State information identifies a first exit P-State. The processor may be configured to receive a request to exit the first idle state, retrieve the first exit P-State information and enter the first exit P-State.

FIELD OF INVENTION

This invention relates to processor power control apparatus and methods and in particular relates to apparatus and methods to manage processor performance states.

BACKGROUND

The Advanced Configuration and Power Interface (ACPI) specification provides a standard for operating system-centric device configuration and power management. The ACPI specification defines various “states” as levels of power usage and/or feature availability. ACPI states include: global states (e.g., G0-G3), device states (e.g., D0-D3), processor states (e.g., C0-C3) and performance states (e.g., P0-Pn). The operating system and/or a user may select a desired processor state and a desired performance state in an effort to save power. However, there is no linkage between processor states and performance states. Under some conditions, power savings are only marginal, since a high performance state will require a high processor frequency and core voltage despite a request to enter an idle processor state.

SUMMARY OF EMBODIMENTS OF THE INVENTION

A processor is configured to support a plurality of performance states and idle states. The processor includes a first programmable location associated with a first idle state and configured to store first entry performance state (P-State) information. The first entry P-State information identifies a first entry P-State. The processor is configured to receive a request to enter the first idle state, retrieve the first entry P-State information and enter the first entry P-State. The processor may include a second programmable location associated with the first idle state and configured to store first exit P-State information. The first exit P-State information identifies a first exit P-State. The processor may be configured to receive a request to exit the first idle state, retrieve the first exit P-State information and enter the first exit P-State.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a diagram illustrating a variety of Advanced Configuration and Power Interface (ACPI) states;

FIG. 2 is a block diagram showing the linkage between the various C-States and P-States;

FIG. 3 is a block diagram of an example multi-core processor with a plurality of programmable storage locations for performance state management; and

FIG. 4 is a block diagram of a processor with multiple banks of programmable locations.

DETAILED DESCRIPTION OF THE EMBODIMENTS

FIG. 1 shows a diagram illustrating a variety of Advanced Configuration and Power Interface (ACPI) states. It should be understood that the techniques disclosed herein may also be applied to other power state standards or configurations. ACPI generally divides power usage and system response into a plurality of states including global states or “G-States” (e.g., G0-G3) as shown in Table 1 below.

TABLE 1

Global State    Description
Legacy          Prior to booting ACPI enabled operating system
G0 (S0)         Working
G1 (S1-S4)      Sleep
G2 (S5)         Soft Off
G3              Mechanical Off

Global state G1 is further subdivided into several sleep states, each with its own hierarchy of response and power savings. For example, S3 is commonly referred to as suspend to RAM and S4 is commonly referred to as suspend to disk. As shown in FIG. 1, the system may switch from one global state to another. Typically, a system is in the Legacy state prior to booting an ACPI enabled operating system. Once an ACPI enabled operating system is booted, the system enters the G0 state. The operating system or the user may initiate a transition into the G1 (sleep) and G2 (soft off) states via a BIOS routine as shown by block 28. The G3 state is defined as mechanical off, i.e., the power cord may be removed.

A plurality of device states or “D-States” (e.g., D0-D3) are also provided. A device state may be associated with a plurality of devices, i.e., each device has its own associated device state. In this example, a CD/DVD drive 22, hard disk drive 24 and generic “other” device 26 are shown. Device states are generally defined as shown in Table 2 below:

TABLE 2

Device State    Description
D0              Operating
D1              Intermediate state (varies by device)
D2              Intermediate state (varies by device)
D3              Off

Processor states or “C-States” are defined as shown in Table 3 below:

TABLE 3

Processor State    Description
C0                 Operating
C1                 Halt
C2                 Stop-Clock
C3                 Sleep
. . .
Cn                 Nth C-State

It should be understood that for multiple core processors, each core may have an associated C-State. During normal operation, a processor core is in the operating state “C0” and the processor core processes instructions normally. The lower C-States (C1, C2 . . . Cn) are referred to as “idle states.” System performance may depend on the selected performance state as discussed below. A system in the C1 state (Halt) does not execute instructions, but may return to an executing state essentially instantaneously. The C1 state has the lowest latency. The hardware latency in this state is low enough that the operating system does not consider the latency aspect of the state when deciding whether to use it.

In the C2 state (Stop-Clock) the processor core is not executing instructions, but will typically take longer to wake up compared to the C1 state. The C2 state offers improved power savings over the C1 state. The worst-case hardware latency for this state is provided via the ACPI system firmware and the operating system may use this information to determine when the C1 state should be used instead of the C2 state. In the C3 state (Sleep), the processor core does not need to keep its cache coherent, but maintains other state information. The C3 state offers improved power savings over the C1 and C2 states. The worst-case hardware latency for this state is provided via the ACPI system firmware and the operating system may use this information to determine when the C2 state should be used instead of the C3 state. While in the C3 state, the processor's caches maintain state but ignore any snoops. The operating system is responsible for ensuring that the caches maintain coherency. It should be understood that additional C-States may be defined without departing from the scope of this disclosure.

A plurality of performance states or “P-States” are defined as shown in Table 4 below:

TABLE 4

Performance State    Description
P0                   Maximum Power (Pmax)
P1                   Intermediate Performance (P0 > P1)
P2                   Intermediate Performance (P1 > P2)
P3                   Intermediate Performance (P2 > P3)
P4                   Intermediate Performance (P3 > P4)
P5                   Intermediate Performance (P4 > P5)
P6                   Intermediate Performance (P5 > P6)
. . .
Pn                   Lowest Performance (Pmin) (P(n − 1) > Pn)

While a given processor core operates in the C0 state, it may be in one of several performance states P0-Pn. It should be understood that each processor core may have its own clock source and core voltage source. In the alternative, some processor cores may share a common clock source and/or core voltage source. Performance states are implementation-dependent. P0 is the highest-performance state. P1-Pn are successively lower-performance states. Typically n is no greater than 16. Each P-State is associated with a processor core operating frequency and core voltage, e.g., V_core.

In most cases, the various states discussed above are controlled by the operating system. With respect to P-States, the operating system typically runs a scheduling routine. Over time, the operating system tracks the current performance level as well as the desired or required performance level for each processor core. The current performance level is periodically compared to the desired level. For example, an operating system may run a performance state scheduler every 100 ms. The performance state may be adjusted as needed based on the results of the comparison.

With respect to C-States, the operating system may use a task scheduler to determine whether there are tasks that require execution. If tasks are present, the operating state (C0) is selected, or remains selected. If no tasks are present, an idle state is selected. In some cases, an idle state may be selected following user input, e.g., terminating an application.

The operating system may maintain a set of entry and exit times for each idle state as shown in Table 5 below:

TABLE 5

C-State    Entry Time    Exit Time
C1         T_c1e         T_c1ex
C2         T_c2e         T_c2ex
C3         T_c3e         T_c3ex
. . .
Cn         T_cne         T_cnex

The operating system may use the entry and exit times to determine which idle state is appropriate. For example, if only a minimal wake-up delay can be tolerated, then C1 is selected. In the alternative, if a longer delay may be tolerated, one of the lower C-States may be selected.
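By way of non-limiting illustration, the idle state selection described above may be sketched as follows. The latency values, the function name select_idle_state and the use of exit time alone as the selection criterion are assumptions made for this example only; actual values would be supplied by the platform.

```python
# Hypothetical exit times in microseconds (stand-ins for T_c1ex, T_c2ex, T_c3ex).
EXIT_TIME_US = {"C1": 1, "C2": 50, "C3": 400}


def select_idle_state(tolerated_delay_us, exit_time_us=EXIT_TIME_US):
    """Return the deepest idle state whose exit time fits the tolerated delay."""
    # Order states from C1 (shallowest) to Cn (deepest) by their numeric suffix.
    states = sorted(exit_time_us, key=lambda name: int(name[1:]))
    chosen = states[0]  # fall back to the shallowest state, C1
    for state in states:
        if exit_time_us[state] <= tolerated_delay_us:
            chosen = state
    return chosen
```

With these example values, a tolerated delay of 100 microseconds selects C2, while a tolerated delay shorter than any listed exit time falls back to C1.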

From the operating system's perspective, there is no logical relationship between C-States and P-States. This may limit power savings. Assume, for example, that a given processor core is operating in the P0 state (highest performance). Also assume that the task scheduler determines that no tasks require execution on the processor core and the operating system selects a C2 state. Under these conditions, the P-State dictates that the processor core use the highest frequency and core voltage. Even though an idle state is selected, power savings are limited due to the high clock rate and core voltage.

In order to address this problem, the processor core may override the operating system and use a P-State that yields increased power savings without unduly affecting latency. FIG. 2 is a block diagram showing the linkage between the various C-States and P-States. In a typical scenario, the operating system will request a transition to an idle state for a given processor core. If the original P-State, e.g., as selected by the operating system, is higher than needed, the processor core may override the original P-State and use a lower P-State. When the operating system requests a pop-up, the processor core may enter the C0 state and restore the original P-State. Such P-State overrides are carried out independently by the processor core; the operating system remains unaware of any P-State overrides.

Each processor core generally has known idle state entry times and exit times. These entry and exit times will generally meet or exceed the values in Table 5 above. In order to maximize power savings, the optimal entry and exit P-State for each idle state may be measured, e.g., via actual measurements or simulations. For example, the entry times for each idle state may be tested for all possible P-States. Similarly, the exit times for each idle state may be tested for all possible P-States. Using this timing information, the optimal P-State upon entry and exit of each idle state may be selected to meet or exceed the timing requirements in Table 5 above and also yield the highest power savings. The optimal entry and exit P-States may then be stored in association with each idle state.
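By way of non-limiting illustration, the selection of an optimal P-State from measured timing data may be sketched as follows. The function name select_optimal_pstate, the use of a measured power figure per P-State and the numeric values in the usage note are assumptions made for this example and are not taken from the disclosure.

```python
def select_optimal_pstate(measured_time_us, power_mw, time_limit_us):
    """Pick the lowest-power P-State whose measured transition time meets the limit.

    measured_time_us and power_mw are dicts keyed by P-State index (0 = P0).
    Returns None when no P-State satisfies the timing requirement.
    """
    # Keep only the P-States that meet the timing requirement for this idle state.
    candidates = [p for p, t in measured_time_us.items() if t <= time_limit_us]
    if not candidates:
        return None
    # Among the candidates, the lowest power draw yields the highest power savings.
    return min(candidates, key=lambda p: power_mw[p])
```

For example, with hypothetical measured entry times {0: 10, 1: 14, 2: 22, 3: 35} microseconds, power figures {0: 900, 1: 700, 2: 500, 3: 350} milliwatts and a limit of 25 microseconds, P2 would be selected, since P3 saves more power but misses the timing requirement.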

Each processor core may maintain a programmable location for the optimal entry and exit P-State for each idle state as shown in Table 6 below:

TABLE 6

C-State    Entry P-State    Exit P-State
C1         P_c1e            P_c1ex
C2         P_c2e            P_c2ex
C3         P_c3e            P_c3ex
. . .
Cn         P_cne            P_cnex

FIG. 3 shows an example multi-core processor 30. In this example, four cores 32, 34, 36 and 38 are shown. It should be understood that a single core processor or a multi-core processor with any number of cores may be used without departing from the scope of this disclosure. Each processor core 32, 34, 36 and 38 may have a bank of programmable locations 40 a, 40 b, 40 c, 40 d that contain the relationship between the various idle states 42 a-46 a, 42 b-46 b, 42 c-46 c and 42 d-46 d and their associated entry and exit P-States. Each bank of programmable locations 40 a, 40 b, 40 c, 40 d may be generally static or may be updated as needed. For example, the BIOS may be configured to update the values in each bank of programmable locations 40 a, 40 b, 40 c, 40 d based on specific operating conditions (e.g., desktop, laptop, client, server . . . ).

In this example, each processor core 32, 34, 36, 38 has a programmable location for the entry P-State 52 a, 52 b, 52 c, 52 d to be used for entry into the C1 state and for the exit P-State 62 a, 62 b, 62 c, 62 d to be used for exit from the C1 state. It should be understood that a processor core may be implemented with programmable locations for entry P-States only or exit P-States only. It should also be understood that any number of C-States may be supported.

Continuing with this example, each processor core also has a programmable location for the entry P-State 54 a, 54 b, 54 c, 54 d to be used for entry into the C2 state and for the exit P-State 64 a, 64 b, 64 c, 64 d to be used for exit from the C2 state. It should be understood that a large number of C-States may be supported. Accordingly, FIG. 3 also shows programmable locations for entry P-State 56 a, 56 b, 56 c, 56 d to be used for entry into the Cn state and for exit P-States 66 a, 66 b, 66 c, 66 d to be used for exit from the Cn state. Each processor core 32, 34, 36, 38 may also include a programmable location 48 a, 48 b, 48 c, 48 d for storage of the original P-State so that processor core 32, 34, 36 and 38 may restore the original P-State selected by the operating system upon pop-up to the C0 state.
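By way of non-limiting illustration, the entry, exit and original P-State locations of a single processor core may be modeled in software as follows. The class name Core, its method names and the exact ordering of the exit P-State and the restored original P-State during pop-up are assumptions made for this sketch; the disclosure describes hardware locations and does not prescribe this sequence.

```python
class Core:
    """Toy model of one processor core; P-States are indices (0 = P0)."""

    def __init__(self, os_pstate, entry_pstates, exit_pstates):
        self.cstate = "C0"
        self.pstate = os_pstate                   # P-State selected by the operating system
        self.entry_pstates = dict(entry_pstates)  # models the entry P-State locations
        self.exit_pstates = dict(exit_pstates)    # models the exit P-State locations
        self.original_pstate = os_pstate          # models the original P-State location
        self.applied = []                         # record of P-States applied by the core

    def _apply(self, pstate):
        self.pstate = pstate
        self.applied.append(pstate)

    def enter_idle(self, cstate):
        """Save the original P-State, then override it with the stored entry P-State."""
        if self.cstate == "C0":
            self.original_pstate = self.pstate
        self._apply(self.entry_pstates[cstate])
        self.cstate = cstate

    def pop_up(self):
        """Leave the idle state using its exit P-State, then restore the original P-State."""
        if self.cstate == "C0":
            return
        self._apply(self.exit_pstates[self.cstate])
        self.cstate = "C0"
        self._apply(self.original_pstate)
```

In this sketch, a core running in P0 that enters C2 with a stored entry P-State of P4 and a stored exit P-State of P2 would apply P4, then P2, and finally return to P0 on pop-up, without any change to the P-State recorded as selected by the operating system.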

FIG. 4 shows an example in which a processor core may optionally support multiple banks of programmable locations. Each bank may include one or more programmable locations. In this example, the processor core 70 includes a plurality of banks 72, 74, 76, 78 each configured to store the relationship between the various idle states and their associated entry and exit P-States. Each bank may be associated with a specific power profile. For example, Bank 1 (72) may be associated with an AC power profile. Bank 2 (74) may be associated with a first battery powered profile. Bank 3 (76) may be associated with a second battery powered profile. It should be understood that a large number of banks may be provided. Bank n (78) may be associated with yet another power profile. It should be understood that power profiles need not be associated with a power source type. For example, a given power profile may be associated with a device type or usage type (e.g., desktop, laptop, server . . . ). Each bank may have a unique set of values for entry and exit P-States for each supported idle state.
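By way of non-limiting illustration, multiple banks keyed by power profile may be modeled as follows. The profile names, the P-State indices and the function name lookup_pstates are hypothetical and are chosen only to show that each bank may hold its own entry and exit P-States for each idle state.

```python
# Each bank maps an idle state to a hypothetical (entry P-State, exit P-State) pair.
BANKS = {
    "ac":        {"C1": (1, 0), "C2": (3, 1)},
    "battery_1": {"C1": (2, 1), "C2": (5, 3)},
    "battery_2": {"C1": (3, 2), "C2": (6, 4)},
}


def lookup_pstates(profile, cstate, banks=BANKS):
    """Return the (entry, exit) P-State pair stored for an idle state in the active bank."""
    entry_pstate, exit_pstate = banks[profile][cstate]
    return entry_pstate, exit_pstate
```

Switching the active profile changes which bank is consulted, so the same request to enter C2 may result in different entry and exit P-States on AC power and on battery power.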

It should be understood that many variations are possible based on the disclosure herein. Although features and elements are described above in particular combinations, each feature or element may be used alone without the other features and elements or in various combinations with or without other features and elements. The methods or flow charts provided herein may be implemented in a computer program, software, or firmware incorporated in a computer-readable storage medium for execution by a general purpose computer or a processor. Examples of computer-readable storage mediums include a read only memory (ROM), a random access memory (RAM), a register, cache memory, semiconductor memory devices, magnetic media such as internal hard disks and removable disks, magneto-optical media, and optical media such as CD-ROM disks, and digital versatile disks (DVDs).

Suitable processors include, by way of example, a general purpose processor, a special purpose processor, a conventional processor, a digital signal processor (DSP), a plurality of microprocessors, one or more microprocessors in association with a DSP core, a controller, a microcontroller, Application Specific Integrated Circuits (ASICs), Field Programmable Gate Arrays (FPGAs) circuits, any other type of integrated circuit (IC), and/or a state machine. Such processors may be manufactured by configuring a manufacturing process using the results of processed hardware description language (HDL) instructions and other intermediary data including netlists (such instructions capable of being stored on a computer readable media). The results of such processing may be maskworks that are then used in a semiconductor manufacturing process to manufacture a processor which implements aspects of the present invention.

1. A processor configured to support a plurality of performance states and idle states, the processor comprising: a first programmable location associated with a first idle state and configured to store first entry performance state (P-State) information, the first entry P-State information identifying a first entry P-State; the processor being configured to retrieve the first entry P-State information and enter the first entry P-State responsive to a request to enter the first idle state.
2. The processor of claim 1 wherein the processor includes a plurality of processor cores.
3. The processor of claim 1 further comprising: a second programmable location associated with the first idle state and configured to store first exit P-State information, the first exit P-State information identifying a first exit P-State; the processor being configured to retrieve the first exit P-State information and enter the first exit P-State responsive to a request to exit the first idle state.
4. The processor of claim 1 further comprising: a third programmable location configured to store original P-State information, the original P-State information identifying an original operating system selected P-State; the processor being configured to exit the first idle state and enter the original P-State responsive to a request to exit the first idle state.
5. The processor of claim 3 wherein the first entry P-State overrides the original P-State.
6. The processor of claim 1 further comprising: a fourth programmable location associated with a second idle state and configured to store second entry P-State information; the second entry P-State information identifying a second entry P-State; the processor being configured to enter the second entry P-State responsive to a request to enter the second idle state.
7. The processor of claim 5 further comprising: a fifth programmable location associated with the second idle state and configured to store second exit P-State information; the second exit P-State information identifying a second exit P-State; the processor being configured to receive a request to exit the second idle state, retrieve the second exit P-State information and exit the second exit P-State.
8. The processor of claim 1 further comprising: a first bank of programmable locations configured to store entry P-State information associated with a first profile and a second bank of programmable locations configured to store entry P-State information associated with second profile.
9. The processor of claim 6 wherein the first programmable bank is associated with an alternating current (AC) power profile and the second bank is associated with a battery power profile.
10. The processor of claim 6 wherein the at least one of the first and second bank is associated with a device type or usage type.
11. The processor of claim 9 wherein the device type identifies at least one of a client or server device.
12. A method of supporting a plurality of performance states and idle states in a processor, the method comprising: storing first entry performance state (P-State) information in a first programmable location associated with a first idle state; the first entry P-State information identifying a first entry P-State; and retrieving the first entry P-State information and entering the first entry P-State responsive to receiving a request to enter the first idle state.
13. The method of claim 12 further comprising: storing first exit P-State information in a second programmable location associated with the first idle state; the first exit P-State information identifying a first exit P-State; and retrieving the first exit P-State information and entering the first exit P-State responsive to receiving a request to exit the first idle state.
14. The method of claim 12 further comprising: storing original P-State information in a third programmable location, the original P-State information identifying an original operating system selected P-State; and exiting the first idle state and entering the original P-State responsive to a request to exit the first idle state.
15. The method of claim 12 wherein the first entry P-State overrides the original P-State.
16. The method of claim 12 further comprising: storing second entry P-State information in a fourth programmable location associated with a second idle state; the second entry P-State information identifying a second entry P-State; receiving a request to enter the second idle state, retrieving the second entry P-State information and entering the second entry P-State.
17. The method of claim 16 further comprising: storing second exit P-State information in a fifth programmable location associated with the second idle state; the second exit P-State information identifying a second exit P-State; retrieving the second exit P-State information and entering the second exit P-State responsive to receiving a request to exit the second idle state.
18. The method of claim 12 further comprising: providing a first bank of programmable locations configured to store entry P-State information associated with a first profile and providing a second bank of programmable locations configured to store entry P-State information associated with second profile.
19. The method of claim 18 wherein the first programmable bank is associated with an alternating current (AC) power profile and the second bank is associated with a battery power profile.
20. The method of claim 18 wherein the at least one of the first and second bank is associated with a device type or usage type.
21. The method of claim 18 wherein the device type identifies at least one of a client or server device.
22. A computer readable media including hardware design code stored thereon, and when processed generates other intermediary data to create mask works for a processor that is configured to perform a method of supporting a plurality of performance states and idle states, the method comprising: storing first entry performance state (P-State) information in a first programmable location associated with a first idle state; the first entry P-State information identifying a first entry P-State; and retrieving the first entry P-State information and entering the first entry P-State responsive to receiving a request to exit the second idle state.
23. The method of claim 22 further comprising: storing first exit P-State information in a second programmable location associated with the first idle state; the first exit P-State information identifying a first exit P-State; and retrieving the first exit P-State information and entering the first exit P-State responsive to receiving a request to exit the first idle state.
24. The method of claim 22 further comprising: storing original P-State information in a third programmable location, the original P-State information identifying an original operating system selected P-State; and exiting the first idle state and entering the original P-State responsive to a request to exit the first idle state.
25. The method of claim 22 wherein the first entry P-State overrides the original P-State.
26. The method of claim 22 further comprising: storing second entry P-State information in a fourth programmable location associated with a second idle state; the second entry P-State information identifying a second entry P-State; receiving a request to enter the second idle state, retrieving the second entry P-State information and entering the second entry P-State.
27. The method of claim 26 further comprising: storing second exit P-State information in a fifth programmable location associated with the second idle state; the second exit P-State information identifying a second exit P-State; retrieving the second exit P-State information and entering the second exit P-State responsive to receiving a request to exit the second idle state.
28. The method of claim 22 further comprising: providing a first bank of programmable locations configured to store entry P-State information associated with a first profile and providing a second bank of programmable locations configured to store entry P-State information associated with second profile.
29. The method of claim 28 wherein the first programmable bank is associated with an alternating current (AC) power profile and the second bank is associated with a battery power profile.
30. The method of claim 28 wherein the at least one of the first and second bank is associated with a device type or usage type.
31. The method of claim 28 wherein the device type identifies at least one of a client or server device.